
    A Lower Bound for the Optimization of Finite Sums

    This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex. We show that no algorithm can reach an error $\epsilon$ in minimizing all functions from this class in fewer than $\Omega(n + \sqrt{n(\kappa-1)}\log(1/\epsilon))$ iterations, where $\kappa = L/\mu$ is a surrogate condition number. We then compare this lower bound to upper bounds for recently developed methods specializing to this setting. When the functions involved in this sum are not arbitrary, but based on i.i.d. random data, we further contrast these complexity results with those for optimal first-order methods that directly optimize the sum. We conclude that considerable caution is necessary for an accurate comparison, and we identify machine learning scenarios where the new methods help computationally.
    Comment: Added an erratum; we are currently working on extending the result to randomized algorithms.
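To get a feel for how the bound scales, the expression inside the $\Omega(\cdot)$ can be evaluated numerically. The sketch below is illustrative only: the function name `finite_sum_lower_bound` is hypothetical, and because the $\Omega$ notation hides a constant factor, the returned value is a growth term rather than an exact iteration count.

```python
import math


def finite_sum_lower_bound(n: int, L: float, mu: float, eps: float) -> float:
    """Evaluate n + sqrt(n * (kappa - 1)) * log(1 / eps) with kappa = L / mu.

    This is the expression inside the Omega(.) of the abstract. The hidden
    constant is not modeled, so treat the result as a scaling term only.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if mu <= 0 or L < mu:
        raise ValueError("need 0 < mu <= L so that kappa >= 1")
    if not 0 < eps < 1:
        raise ValueError("eps must lie in (0, 1)")
    kappa = L / mu  # surrogate condition number
    return n + math.sqrt(n * (kappa - 1.0)) * math.log(1.0 / eps)


if __name__ == "__main__":
    # Compare how the term grows with n for a fixed kappa and target error.
    for n in (10, 1_000, 100_000):
        print(n, finite_sum_lower_bound(n, L=100.0, mu=1.0, eps=1e-6))
```

When $\kappa = 1$ the square-root term vanishes and the expression reduces to $n$, which matches the intuition that each of the $n$ functions must be touched at least once.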

    Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need

    Self-Supervised Learning (SSL) has emerged as the solution of choice to learn transferable representations from unlabeled data. However, SSL requires building samples that are known to be semantically akin, i.e. positive views. Requiring such knowledge is the main limitation of SSL and is often tackled by ad-hoc strategies, e.g. applying known data augmentations to the same input. In this work, we generalize and formalize this principle through Positive Active Learning (PAL), where an oracle queries semantic relationships between samples. PAL achieves three main objectives. First, it unveils a theoretically grounded learning framework beyond SSL that can be extended to tackle supervised and semi-supervised learning, depending on the employed oracle. Second, it provides a consistent algorithm to embed a priori knowledge, e.g. some observed labels, into any SSL loss without any change in the training pipeline. Third, it provides a proper active learning framework yielding low-cost solutions to annotate datasets, arguably bridging the gap between the theory and practice of active learning, based on queries of semantic relationships between inputs that are simple for non-experts to answer.
    Comment: 8 main pages, 20 total, 10 figures.
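The idea that different oracles recover different learning regimes can be illustrated with a toy pairwise-relationship matrix. This is a minimal sketch and not the paper's algorithm: the helper names `same_key_oracle` and `relation_matrix` are hypothetical, and the oracle is assumed to simply answer whether two samples are semantically related.

```python
from typing import Callable, Hashable, List, Sequence

# Hypothetical oracle type: takes two sample indices, returns True if related.
Oracle = Callable[[int, int], bool]


def same_key_oracle(keys: Sequence[Hashable]) -> Oracle:
    """Build an oracle that calls two samples related when their keys match.

    With class labels as keys this mimics a supervised setting; with the
    index of the source input as key (for augmented views) it mimics the
    usual SSL notion of positive views.
    """
    def oracle(i: int, j: int) -> bool:
        return keys[i] == keys[j]
    return oracle


def relation_matrix(n: int, oracle: Oracle) -> List[List[int]]:
    """Query the oracle on every pair and store the answers as 0/1 entries."""
    return [[1 if oracle(i, j) else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Supervised-style oracle: samples 0 and 2 share a label.
    print(relation_matrix(3, same_key_oracle(["cat", "dog", "cat"])))
    # SSL-style oracle: views 0 and 1 were augmented from the same input.
    print(relation_matrix(4, same_key_oracle([0, 0, 1, 1])))
```

Only the oracle changes between the two calls; the code that consumes the relationships stays the same, which is the property the abstract highlights.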